Similar resources
ntHash: recursive nucleotide hashing
MOTIVATION: Hashing has been widely used for indexing, querying and rapid similarity search in many bioinformatics applications, including sequence alignment, genome and transcriptome assembly, k-mer counting and error correction. Hence, expediting hashing operations would have a substantial impact in the field, making bioinformatics applications faster and more efficient. RESULTS: We present n...
Recursive Similarity Hashing of Fractal Geometry
A new technique of topological multi-scale analysis is introduced. By performing a clustering recursively to build a hierarchy, and analyzing the co-scale and intra-scale similarities, an Iterated Function System can be extracted from any data set. The study of fractals shows that this method is efficient to extract self-similarities, and can find elegant solutions to the inverse problem of buildi...
Recursive n-gram hashing is pairwise independent, at best
Many applications use sequences of n consecutive symbols (n-grams). Hashing these n-grams can be a performance bottleneck. For more speed, recursive hash families compute hash values by updating previous values. We prove that recursive hash families cannot be more than pairwise independent. While hashing by irreducible polynomials is pairwise independent, our implementations either run in time ...
Recursive Hashing and One-Pass, One-Hash n-Gram Count Estimation
Many applications use sequences of n consecutive symbols (n-grams). We review n-gram hashing and prove that recursive hash families are pairwise independent at best. We prove that hashing by irreducible polynomials is pairwise independent whereas hashing by cyclic polynomials is quasi-pairwise independent: we make it pairwise independent by discarding n − 1 bits. One application of hashing is to...
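As a rough illustration of what "recursive" means in the abstracts above, here is a minimal Python sketch of a cyclic-polynomial style rolling hash: each n-gram hash is obtained from the previous one with one rotation and two XORs instead of rehashing all n symbols. The 32-bit width, the seeded random lookup table, and the names `rotl`, `make_table`, `direct_hash` and `rolling_hashes` are assumptions made for this example, not taken from any of the listed papers, and the sketch does not apply the bit-discarding step the abstract mentions.

```python
import random

L = 32                      # hash width in bits (assumption for this sketch)
MASK = (1 << L) - 1


def rotl(x, r):
    """Rotate the L-bit value x left by r positions."""
    r %= L
    return ((x << r) | (x >> (L - r))) & MASK


def make_table(alphabet, seed=0):
    """Map each symbol to a random L-bit value (hypothetical lookup table)."""
    rng = random.Random(seed)
    return {c: rng.getrandbits(L) for c in alphabet}


def direct_hash(gram, table):
    """Hash one n-gram from scratch: XOR of rotated symbol values."""
    n = len(gram)
    h = 0
    for i, c in enumerate(gram):
        h ^= rotl(table[c], n - 1 - i)
    return h


def rolling_hashes(text, n, table):
    """Hash every n-gram of text, updating the previous hash recursively."""
    if len(text) < n:
        return []
    h = direct_hash(text[:n], table)
    out = [h]
    for i in range(n, len(text)):
        # Rotate the old hash, cancel the outgoing symbol, add the incoming one.
        h = rotl(h, 1) ^ rotl(table[text[i - n]], n) ^ table[text[i]]
        out.append(h)
    return out


table = make_table("ACGT")
text = "ACGTACGGT"
hashes = rolling_hashes(text, 3, table)
```

The update works because rotation distributes over XOR, so rotating the whole hash by one position shifts every symbol's contribution at once. The recursive values can be checked against `direct_hash` applied to each window.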
Universal Hashing and Perfect Hashing
Each of the key values x comes from a universe U, i.e., x ∈ U. In this document, we assume U = {1, 2, ..., N}. Observe that the set S is a dynamic set. Each of the Insert and Delete operations may modify the set. Hence the size of the set S changes with each operation. We bound the maximum size of the set to n (n << N). What are the data structures that can be used to store the set S? One opt...
Journal
Journal title: Bioinformatics
Year: 2016
ISSN: 1367-4803,1460-2059
DOI: 10.1093/bioinformatics/btw397